Comparison of Three Nonparametric Density Estimation Techniques Using Bayes' Classifiers Applied to Microarray Data Analysis
Abstract
In this study, we attempt to distinguish between acute myeloid leukemia (AML) and acute lymphoid leukemia (ALL) using microarray gene expression data. Bayes' classification is used with three different density estimation techniques: Parzen, k-nearest neighbors (k-NN), and a new hybrid method, called k-neighborhood Parzen (k-NP), that combines properties of the other two. The classifiers are applied both to single genes and to pairs of genes. The highest testing accuracies of the three classifiers were within 3% of one another in both the one-gene and two-gene cases. Our overall best classifier used k-NN with two genes and achieved a testing accuracy of 94%. A comparison with other published results shows that our methods performed very well, especially considering the low number
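As a rough illustration of the approach the abstract describes, the sketch below plugs two nonparametric density estimates (a Gaussian-kernel Parzen window and a k-NN estimate) into a Bayes decision rule. It is not the authors' implementation: the function names, the Gaussian kernel, the bandwidth `h`, the value of `k`, and the synthetic one-dimensional "expression" values are all assumptions made for the example, and the k-NP hybrid is omitted because the abstract does not define it.

```python
import math
import numpy as np


def parzen_density(x, samples, h):
    """Parzen estimate at x: the average of Gaussian kernels (bandwidth h)
    centred on each training sample. The Gaussian kernel is an assumption."""
    n, d = samples.shape
    sq_dist = np.sum((samples - x) ** 2, axis=1)
    norm = (2.0 * math.pi * h * h) ** (d / 2.0)
    return float(np.mean(np.exp(-sq_dist / (2.0 * h * h))) / norm)


def knn_density(x, samples, k):
    """k-NN estimate at x: k / (n * V), where V is the volume of the ball
    around x whose radius is the distance to the k-th nearest sample."""
    n, d = samples.shape
    dist = np.sort(np.linalg.norm(samples - x, axis=1))
    radius = max(float(dist[k - 1]), 1e-12)  # guard against a zero radius
    unit_ball = math.pi ** (d / 2.0) / math.gamma(d / 2.0 + 1.0)
    return k / (n * unit_ball * radius ** d)


def bayes_classify(x, class_samples, density, **params):
    """Return the class that maximises prior * estimated class density.
    Priors are taken from the class frequencies in the training data."""
    total = sum(len(s) for s in class_samples.values())
    scores = {
        label: (len(s) / total) * density(x, s, **params)
        for label, s in class_samples.items()
    }
    return max(scores, key=scores.get)


# Synthetic stand-in for one gene's expression level (NOT real microarray data).
rng = np.random.default_rng(0)
data = {
    "AML": rng.normal(loc=-2.0, scale=1.0, size=(30, 1)),
    "ALL": rng.normal(loc=2.0, scale=1.0, size=(30, 1)),
}

query = np.array([-2.0])
print("Parzen:", bayes_classify(query, data, parzen_density, h=0.5))
print("k-NN:  ", bayes_classify(query, data, knn_density, k=5))
```

Both estimators share the signature `(x, samples, ...)`, so the same decision rule can be reused for the two-gene case by passing samples of shape `(n, 2)`.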
Similar resources
Statistical Topology Using the Nonparametric Density Estimation and Bootstrap Algorithm
This paper presents approximate confidence intervals for each function of parameters in a Banach space based on a bootstrap algorithm. We apply a kernel density approach to estimate the persistence landscape. In addition, we evaluate the quality distribution function estimator of random variables using integrated mean square error (IMSE). The results of simulation studies show a significant impro...
Optimal Bayes Classifiers for Functional Data and Density Ratios
Bayes classifiers for functional data pose a challenge. This is because probability density functions do not exist for functional data. As a consequence, the classical Bayes classifier using density quotients needs to be modified. We propose to use density ratios of projections on a sequence of eigenfunctions that are common to the groups to be classified. The density ratios can then be factore...
United Statistical Algorithms, Lp Comoments, Copula Density, Nonparametric Modeling
In addition to solving problems “retail” (one at a time for one client/collaborator), academic statistics should aim to solve problems “wholesale” (algorithms and computer code that can be applied for many clients). We call this approach to teaching and practice SMART COMPUTATIONAL STATISTICS = united data science algorithms providing methods for Small Data and Big Data. It practices that the go...
Performance Analysis of Machine Learning Techniques in Micro Array Data Classification
Abstract: The development of data-mining applications such as classification has shown the need for Machine Learning algorithms to be applied on large scale data. This paper presents the comparison of different classification techniques and investigates the performance of different classifiers for a set of Micro Array data. The algorithm or methods tested are Naive Bayes Classifier, Support Vec...
A Comparative Study of Nonparametric Methods for Pattern Recognition
The applied research discussed in this report determines and compares the correct classification percentage of the nonparametric sign test, Wilcoxon's signed rank test, and K-class classifier with the performance of the Bayes classifier. The performance is determined for data which have Gaussian, Laplacian and Rayleigh probability density functions. The correct classification percentage is show...